Smart Paper Notes

An Efficient Industrial Federated Learning Framework for AIoT: A Face Recognition Application

Youlong Ding, Xueyang Wu, Zhitao Li, Zeheng Wu, Shengqi Tan, Qian Xu, Weike Pan, Qiang Yang

Categories: Computer Vision | Machine Learning

2022-06-21

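Recently, the Artificial Intelligence of Things (AIoT) has been gaining increasing attention, with the intriguing vision of providing highly intelligent services through networked things, leading to an advanced AI-driven ecosystem. However, recent regulatory restrictions on data privacy preclude uploading sensitive local data to data centers and using it in a centralized manner. In this setting, directly applying federated learning algorithms can hardly meet industrial requirements for both efficiency and accuracy. We therefore propose an efficient industrial federated learning framework for AIoT, demonstrated on a face recognition application. Specifically, we use the concept of transfer learning to speed up federated training on devices, and we further present a novel private projector design that helps protect shared gradients without incurring additional memory consumption or computational cost. Empirical studies on a private Asian face dataset show that our approach achieves high recognition accuracy within only 20 communication rounds, demonstrating its effectiveness in prediction and its efficiency in training.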
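
As a rough illustration of what a single communication round aggregates, the following is a minimal sketch of sample-weighted parameter averaging in the style of federated averaging. It is not the framework proposed in the paper: the function name `federated_average`, the flat list-of-floats parameter format, and the idea that only a small trainable head is exchanged while a pretrained backbone stays fixed on each device are assumptions made for this example, and the private projector is not modeled at all.

```python
def federated_average(client_updates):
    """Sample-weighted average of client parameter vectors.

    client_updates: list of (num_samples, params) pairs, where params is a
    list of floats with the same length for every client. This format is a
    hypothetical choice for the sketch, not the paper's protocol.
    """
    total = sum(n for n, _ in client_updates)
    dim = len(client_updates[0][1])
    return [
        sum(n * params[i] for n, params in client_updates) / total
        for i in range(dim)
    ]


# Two hypothetical devices reporting the weights of a small trainable head.
updates = [(100, [1.0, 2.0]), (300, [3.0, 4.0])]
global_head = federated_average(updates)
print(global_head)
```

The device with more samples pulls the average toward its own parameters, which is the usual motivation for weighting by sample count.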

WrapperFL: A Model Agnostic Plug-in for Industrial Federated Learning

Xueyang Wu, Shengqi Tan, Qian Xu, Qiang Yang

Categories: Machine Learning | Artificial Intelligence

2022-06-21

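As a privacy-preserving collaborative machine learning paradigm, federated learning has been attracting growing attention in industry. With the sharp rise in demand, many federated learning platforms now allow participants to set up and build a federated model from scratch. However, existing platforms are highly intrusive, complex, and hard to integrate with machine learning models that have already been built. For many real-world businesses that already have mature serving models, existing federated learning platforms have high entry barriers and development costs. This paper presents a simple yet practical federated learning plug-in inspired by ensemble learning, dubbed WrapperFL, which allows participants to build or join a federated system with their existing models at minimal cost. WrapperFL works by simply attaching to the input and output interfaces of an existing model, without the need for redevelopment, which significantly reduces the overhead in manpower and resources. We verify the proposed method under heterogeneous data distributions and heterogeneous models. Experimental results show that WrapperFL can be successfully applied to a wide range of applications under practical settings and improves the local model through federated learning at low cost.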
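
The plug-in idea of attaching to an existing model's input and output interfaces can be sketched as a thin wrapper that leaves the local model untouched. The class name `FederatedWrapper`, the `federated_model` callable, and the fixed mixing weight are all hypothetical; the abstract says only that the design is inspired by ensemble learning, so the simple weighted average below is an assumption rather than WrapperFL's actual combination rule.

```python
class FederatedWrapper:
    """Wraps an existing predictor without modifying it (illustrative only)."""

    def __init__(self, local_model, federated_model, weight=0.5):
        self.local_model = local_model          # existing model, used as-is
        self.federated_model = federated_model  # hypothetical federated component
        self.weight = weight                    # assumed mixing weight in [0, 1]

    def predict(self, x):
        # Both components see the same input; their outputs are blended.
        local_scores = self.local_model(x)
        federated_scores = self.federated_model(x)
        w = self.weight
        return [
            (1 - w) * l + w * f
            for l, f in zip(local_scores, federated_scores)
        ]


def local_model(x):
    # Stand-in for a mature serving model that returns class scores.
    return [1.0, 0.0]


def federated_model(x):
    # Stand-in for a component trained collaboratively across participants.
    return [0.5, 0.5]


wrapped = FederatedWrapper(local_model, federated_model, weight=0.5)
scores = wrapped.predict([0.1, 0.2])
print(scores)
```

Because the wrapper only calls the local model, the existing serving code does not have to be redeveloped, which is the property the abstract emphasizes.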

Finding the Most Transferable Tasks for Brain Image Segmentation

Yicong Li, Yang Tan, Jingyun Yang, Yang Li, Xiao-Ping Zhang

Categories: Artificial Intelligence | Computer Vision | Machine Learning

2023-01-03

Although many studies have successfully applied transfer learning to medical image segmentation, very few have investigated the selection strategy when multiple source tasks are available for transfer. In this paper, we propose a prior-knowledge-guided, transferability-based framework to select the best source tasks among a collection of brain image segmentation tasks, in order to improve transfer learning performance on a given target task. The framework consists of modality analysis, RoI (region of interest) analysis, and transferability estimation, so that the source task selection can be refined step by step. Specifically, we adapt state-of-the-art analytical transferability estimation metrics to medical image segmentation tasks and further show that their performance can be significantly boosted by filtering candidate source tasks based on modality and RoI characteristics. Our experiments on brain matter, brain tumor, and white matter hyperintensities segmentation datasets reveal that transferring from different tasks under the same modality is often more successful than transferring from the same task under different modalities. Furthermore, within the same modality, transferring from the source task that has stronger RoI shape similarity with the target task can significantly improve the final transfer performance, and this similarity can be captured using the Structural Similarity index in the label space.
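
To make the last point concrete, here is a minimal sketch of a Structural Similarity (SSIM) score computed directly on label masks. It is a simplification under stated assumptions: the masks are flattened binary lists, the statistics are taken over the whole mask instead of the local sliding windows used in standard SSIM, and the function name `global_ssim` is hypothetical. The paper's exact procedure for comparing RoI shapes may differ.

```python
def global_ssim(mask_a, mask_b, dynamic_range=1.0):
    """SSIM with a single global window over two equally sized flat masks."""
    n = len(mask_a)
    mu_a = sum(mask_a) / n
    mu_b = sum(mask_b) / n
    var_a = sum((a - mu_a) * (a - mu_a) for a in mask_a) / n
    var_b = sum((b - mu_b) * (b - mu_b) for b in mask_b) / n
    cov = sum((a - mu_a) * (b - mu_b) for a, b in zip(mask_a, mask_b)) / n
    # Standard SSIM stabilizing constants (K1 = 0.01, K2 = 0.03).
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    numerator = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    denominator = (mu_a * mu_a + mu_b * mu_b + c1) * (var_a + var_b + c2)
    return numerator / denominator


# Hypothetical flattened label masks: 1 marks the region of interest.
target_mask = [0, 1, 1, 0, 0, 1, 1, 0]
similar_source = [0, 1, 1, 0, 0, 1, 0, 0]
opposite_source = [1, 0, 0, 1, 1, 0, 0, 1]

print(global_ssim(target_mask, similar_source))
print(global_ssim(target_mask, opposite_source))
```

Under this score, a source task whose masks overlap the target's RoI layout ranks higher than one whose masks are complementary.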

Neural Collapse in Deep Linear Network: From Balanced to Imbalanced Data

Hien Dang, Tan Nguyen, Tho Tran, Hung Tran, Nhat Ho

Categories: Machine Learning | Machine Learning (Statistics)

2023-01-01

Modern deep neural networks have achieved superhuman performance in tasks from image classification to game play. Surprisingly, these various complex systems with massive numbers of parameters exhibit the same remarkable structural properties in their last-layer features and classifiers across canonical datasets. This phenomenon is known as "Neural Collapse," and it was discovered empirically by Papyan et al. Recent papers have theoretically shown that the global solutions of the network training problem under a simplified "unconstrained feature model" exhibit this phenomenon. We take a step further and prove that Neural Collapse occurs in deep linear networks for the popular mean squared error (MSE) and cross-entropy (CE) losses. Furthermore, we extend our study to imbalanced data for the MSE loss and present the first geometric analysis of Neural Collapse in this setting.
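
For balanced data, the structure associated with Neural Collapse is usually described as a simplex equiangular tight frame (ETF): with K classes, the centered class means have equal norm and every pair has cosine similarity -1/(K-1). The sketch below constructs such a frame and checks those two properties numerically. The function names are hypothetical, and the code illustrates only the balanced-case geometry, not the imbalanced-data geometry analyzed in the paper.

```python
import math


def simplex_etf(num_classes):
    """Rows are the K vectors sqrt(K / (K - 1)) * (e_k - (1 / K) * ones)."""
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


vectors = simplex_etf(4)
norms = [math.sqrt(dot(v, v)) for v in vectors]
cosine_01 = dot(vectors[0], vectors[1]) / (norms[0] * norms[1])

print(norms)       # all close to 1
print(cosine_01)   # close to -1/3 for K = 4
```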

PAC-Bayesian-Like Error Bound for a Class of Linear Time-Invariant Stochastic State-Space Models

Deividas Eringis, John Leth, Zheng-Hua Tan, Rafal Wisniewski, Mihaly Petreczky

Categories: Machine Learning (Statistics) | Machine Learning

2022-12-30

In this paper we derive a PAC-Bayesian-Like error bound for a class of stochastic dynamical systems with inputs, namely linear time-invariant stochastic state-space models (stochastic LTI systems for short). This class of systems is widely used in control engineering and econometrics; in particular, it represents a special case of recurrent neural networks. In this paper we 1) formalize the learning problem for stochastic LTI systems with inputs, 2) derive a PAC-Bayesian-Like error bound for such systems, and 3) discuss various consequences of this error bound.
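
For readers unfamiliar with the model class, the sketch below simulates a scalar stochastic LTI system with an input: the state follows x[t+1] = a*x[t] + b*u[t] + noise and the output is y[t] = c*x[t] + noise. The scalar form, the Gaussian noise, and the function name `simulate_lti` are assumptions chosen for illustration; the paper's exact parameterization and the error bound itself are not reproduced here.

```python
import random


def simulate_lti(a, b, c, inputs, noise_std=0.0, x0=0.0, seed=0):
    """Simulate y[t] = c*x[t] + v[t], x[t+1] = a*x[t] + b*u[t] + w[t]."""
    rng = random.Random(seed)
    x = x0
    outputs = []
    for u in inputs:
        outputs.append(c * x + rng.gauss(0.0, noise_std))  # measurement
        x = a * x + b * u + rng.gauss(0.0, noise_std)      # state update
    return outputs


# With zero noise the recursion is deterministic and easy to follow by hand.
noise_free = simulate_lti(a=0.5, b=1.0, c=1.0, inputs=[1.0, 1.0, 1.0])
noisy = simulate_lti(a=0.5, b=1.0, c=1.0, inputs=[1.0, 1.0, 1.0], noise_std=0.1)
print(noise_free)
print(noisy)
```

The same state is fed back at every step through fixed linear maps, which is why such models can be read as a special case of recurrent neural networks.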

ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech

Zehua Chen, Yihan Wu, Yichong Leng, Jiawei Chen, Haohe Liu, Xu Tan, Yang Cui, Ke Wang, Lei He, Sheng Zhao

Categories: Natural Language Processing | Machine Learning

2022-12-30

Denoising Diffusion Probabilistic Models (DDPMs) are emerging in text-to-speech (TTS) synthesis because of their strong capability to generate high-fidelity samples. However, their iterative refinement process in a high-dimensional data space results in slow inference, which restricts their application in real-time systems. Previous works have explored speeding up inference by minimizing the number of inference steps, but at the cost of sample quality. In this work, to improve the inference speed of DDPM-based TTS models while achieving high sample quality, we propose ResGrad, a lightweight diffusion model that learns to refine the output spectrogram of an existing TTS model (e.g., FastSpeech 2) by predicting the residual between the model output and the corresponding ground-truth speech. ResGrad has several advantages: 1) Compared with other acceleration methods for DDPMs, which need to synthesize speech from scratch, ResGrad reduces the complexity of the task by changing the generation target from the ground-truth mel-spectrogram to the residual, resulting in a more lightweight model and thus a smaller real-time factor. 2) ResGrad is employed in the inference process of the existing TTS model in a plug-and-play way, without re-training this model. We verify ResGrad on the single-speaker dataset LJSpeech and on two more challenging datasets with multiple speakers (LibriTTS) and a high sampling rate (VCTK). Experimental results show that, in comparison with other speed-up methods for DDPMs, 1) ResGrad achieves better sample quality at the same inference speed as measured by real-time factor, and 2) at similar speech quality, ResGrad synthesizes speech more than 10 times faster than baseline methods. Audio samples are available at https://resgrad1.github.io/.
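
The change of generation target can be sketched in a few lines. In this toy example a mel-spectrogram is a list of frames, the training target is the element-wise difference between the ground truth and the existing TTS model's output, and refinement adds a predicted residual back onto that output. The sign convention, the helper names, and the use of the exact residual in place of a diffusion model's prediction are assumptions for illustration only.

```python
def residual_target(tts_mel, ground_truth_mel):
    """Element-wise residual: ground truth minus the existing model's output."""
    return [
        [g - t for t, g in zip(tts_frame, gt_frame)]
        for tts_frame, gt_frame in zip(tts_mel, ground_truth_mel)
    ]


def refine(tts_mel, predicted_residual):
    """Plug-and-play refinement: add a predicted residual to the TTS output."""
    return [
        [t + r for t, r in zip(tts_frame, res_frame)]
        for tts_frame, res_frame in zip(tts_mel, predicted_residual)
    ]


# Toy 2-frame, 2-bin "spectrograms" with hypothetical values.
tts_mel = [[1.0, 2.0], [0.5, 0.25]]
ground_truth_mel = [[1.5, 1.75], [0.5, 1.25]]

target = residual_target(tts_mel, ground_truth_mel)
# A perfect residual predictor would recover the ground truth exactly.
refined = refine(tts_mel, target)
print(target)
print(refined)
```

Because the existing TTS model already accounts for most of the signal, the residual is a smaller target than the full spectrogram, which matches the abstract's argument for a more lightweight model.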

SESNet: sequence-structure feature-integrated deep learning method for data-efficient protein engineering

Mingchen Li, Liqi Kang, Yi Xiong, Yu Guang Wang, Guisheng Fan, Pan Tan, Liang Hong

Categories: Machine Learning

2022-12-29

Deep learning has been widely used for protein engineering. However, it is limited by the lack of sufficient experimental data to train an accurate model for predicting the functional fitness of high-order mutants. Here, we develop SESNet, a supervised deep-learning model that predicts the fitness of protein mutants by leveraging both sequence and structure information and exploiting an attention mechanism. Our model integrates the local evolutionary context from homologous sequences, the global evolutionary context encoding rich semantics from the universal protein sequence space, and the structure information accounting for the microenvironment around each residue in a protein. We show that SESNet outperforms state-of-the-art models for predicting the sequence-function relationship on 26 deep mutational scanning datasets. More importantly, we propose a data augmentation strategy that leverages data from unsupervised models to pre-train our model. After that, our model can achieve strikingly high accuracy in predicting the fitness of protein mutants, especially for higher-order variants (> 4 mutation sites), when fine-tuned using only a small number of experimental mutation data (<50). The proposed strategy is of great practical value, as the required experimental effort, i.e., producing a few tens of experimental mutation data on a given protein, is generally affordable for an ordinary biochemical group and can be applied to almost any protein.

Differentiable Search of Accurate and Robust Architectures

Yuwei Ou, Xiangning Xie, Shangce Gao, Yanan Sun, Kay Chen Tan, Jiancheng Lv

Categories: Machine Learning | Artificial Intelligence

2022-12-28

Deep neural networks (DNNs) have been found to be vulnerable to adversarial attacks, and various defense methods have been proposed. Among these methods, adversarial training has been drawing increasing attention because of its simplicity and effectiveness. However, the performance of adversarial training is greatly limited by the architectures of the target DNNs, which often leaves the resulting DNNs with poor accuracy and unsatisfactory robustness. To address this problem, we propose DSARA to automatically search for neural architectures that are accurate and robust after adversarial training. In particular, we design a novel cell-based search space specifically for adversarial training, which improves the upper bound on the accuracy and robustness of the searched architectures by carefully designing the placement of the cells and the proportional relationship of the filter numbers. We then propose a two-stage search strategy to search for neural architectures that are both accurate and robust. In the first stage, the architecture parameters are optimized to minimize the adversarial loss, which makes full use of the effectiveness of adversarial training in enhancing robustness. In the second stage, the architecture parameters are optimized to minimize both the natural loss and the adversarial loss using the proposed multi-objective adversarial training method, so that the searched neural architectures are both accurate and robust. We evaluate the proposed algorithm on natural data and under various adversarial attacks, and the results show the superiority of the proposed method in finding architectures that are both accurate and robust. We also conclude that accurate and robust neural architectures tend to deploy very different structures near the input and the output, which has great practical significance for both hand-crafting and automatic design of accurate and robust neural architectures.

Semi-Supervised Semantic Segmentation Methods for UW-OCTA Diabetic Retinopathy Grade Assessment

Zhuoyi Tan, Hizmawati Madzin, Zeyu Ding

Categories: Computer Vision

2022-12-27

People with diabetes are more likely to develop diabetic retinopathy (DR) than healthy people, and DR is the leading cause of blindness. At present, the diagnosis of diabetic retinopathy mainly relies on experienced clinicians recognizing fine features in color fundus images, which is a time-consuming task. Therefore, to promote the development of automatic UW-OCTA DR detection, we propose a novel semi-supervised semantic segmentation method for UW-OCTA DR image grade assessment. First, the method uses the MAE algorithm to perform semi-supervised pre-training on the UW-OCTA DR grade assessment dataset to mine the supervised information in the UW-OCTA images, thereby alleviating the need for labeled data. Second, to mine the lesion features of each region in the UW-OCTA image more fully, we construct a cross-algorithm ensemble DR tissue segmentation algorithm by deploying three algorithms with different visual feature processing strategies. The algorithm contains three sub-algorithms, namely pre-trained MAE, ConvNeXt, and SegFormer, and is named MCS-DRNet after their initials. Finally, we use MCS-DRNet as an inspector to check and revise the preliminary results of the DR grade evaluation algorithm. The experimental results show that the mean Dice similarity coefficients of MCS-DRNet v1 and v2 are 0.5161 and 0.5544, respectively. The quadratic weighted kappa of the DR grading evaluation is 0.7559. Our code will be released soon.
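
The two reported metrics have standard definitions, sketched below in plain Python. The Dice similarity coefficient for binary masks is 2|A∩B| / (|A| + |B|), and the quadratic weighted kappa compares observed and chance-expected disagreement with weights (i - j)^2 / (N - 1)^2. The function names and the toy inputs are hypothetical, and the paper's evaluation code may handle details such as empty masks or per-class averaging differently.

```python
def dice_coefficient(pred, target):
    """Dice similarity for two flat binary masks of equal length."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """Quadratic weighted kappa for integer grades in [0, num_classes)."""
    n = len(rater_a)
    observed = [[0] * num_classes for _ in range(num_classes)]
    for i, j in zip(rater_a, rater_b):
        observed[i][j] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(num_classes))
              for j in range(num_classes)]
    numerator = 0.0
    denominator = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            numerator += weight * observed[i][j]
            denominator += weight * hist_a[i] * hist_b[j] / n
    return 1.0 - numerator / denominator


dice = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
kappa_perfect = quadratic_weighted_kappa([0, 1, 2, 1], [0, 1, 2, 1], 3)
print(dice, kappa_perfect)
```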

PaletteNeRF: Palette-based Color Editing for NeRFs

Qiling Wu, Jianchao Tan, Kun Xu

Categories: Computer Vision

2022-12-25

Neural Radiance Field (NeRF) is a powerful tool to faithfully generate novel views for scenes with only sparse captured images. Despite its strong capability for representing 3D scenes and their appearance, its editing ability is very limited. In this paper, we propose a simple but effective extension of vanilla NeRF, named PaletteNeRF, to enable efficient color editing on NeRF-represented scenes. Motivated by recent palette-based image decomposition works, we approximate each pixel color as a sum of palette colors modulated by additive weights. Instead of predicting pixel colors as in vanilla NeRFs, our method predicts additive weights. The underlying NeRF backbone could also be replaced with more recent NeRF models such as KiloNeRF to achieve real-time editing. Experimental results demonstrate that our method achieves efficient, view-consistent, and artifact-free color editing on a wide range of NeRF-represented scenes.
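
The palette decomposition described above can be illustrated for a single pixel: its color is a weighted sum of palette colors, so recoloring amounts to changing a palette entry while keeping the weights fixed. The helper name `blend`, the RGB tuples, and the weights below are hypothetical, and in the actual method the weights would come from the NeRF model rather than being set by hand.

```python
def blend(palette, weights):
    """Pixel color as a sum of palette colors modulated by additive weights."""
    return tuple(
        sum(w * color[channel] for w, color in zip(weights, palette))
        for channel in range(3)
    )


# Hypothetical two-color palette (RGB in [0, 1]) and per-pixel weights.
palette = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
weights = [0.5, 0.5]
original = blend(palette, weights)

# Editing: swap the red palette entry for green, keep the weights unchanged.
edited_palette = [(0.0, 1.0, 0.0), palette[1]]
edited = blend(edited_palette, weights)

print(original)
print(edited)
```

Since the weights are untouched by an edit, every pixel that used the old palette color changes consistently, which is one way to read the abstract's claim of view-consistent editing.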
